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Abstract 

Situations where individuals have to contribute to joint efforts or share scarce resources are ubiquitous. Yet, 
without proper mechanisms to ensure cooperation, the evolutionary pressure to maximize individual success 
tends to create a tragedy of the commons (such as over-fishing or the destruction of our environment). This 
contribution addresses a number of related puzzles of human behavior with an evolutionary game theoretical 
approach as it has been successfully used to explain the behavior of other biological species many times, from 
bacteria to vertebrates. Our agent-based model distinguishes individuals applying four different behavioral 
strategies: non-cooperative individuals ("defectors"), cooperative individuals abstaining from punishment ef- 
forts (called "cooperators" or "second-order free-riders" ), cooperators who punish non-cooperative behavior 
("moralists"), and defectors, who punish other defectors despite being non-cooperative themselves ("immoral- 
ists"). By considering spatial interactions with neighboring individuals, our model reveals several interesting 
effects: First, moralists can fully eliminate cooperators. This spreading of punishing behavior requires a segre- 
gation of behavioral strategies and solves the "second-order free-rider problem" . Second, the system behavior 
changes its character significantly even after very long times ("who laughs last laughs best effect"). Third, the 
presence of a number of defectors can largely accelerate the victory of moralists over non-punishing coopera- 
tors. Forth, in order to succeed, moralists may profit from immoralists in a way that appears like an "unholy 
collaboration" . Our findings suggest that the consideration of punishment strategies allows to understand the 
establishment and spreading of "moral behavior" by means of game-theoretical concepts. This demonstrates 
that quantitative biological modeling approaches are powerful even in domains that have been addressed with 
non-mathematical concepts so far. The complex dynamics of certain social behaviors becomes understandable 
as result of an evolutionary competition between different behavioral strategies. 

Author Summary 

Why do friends spontaneously come up with mutually accepted rules, cooperation, and solidarity, while the 
creation of shared moral standards often fails in large communities? In a "global village" , where everybody may 
interact with anybody else, it is not worthwhile to punish people who cheat. Moralists (cooperative individuals 
who undertake punishment efforts) disappear because of their disadvantage compared to cooperators who 
do not punish (so-called "second-order free-riders" ). However, cooperators are exploited by free-riders. This 
creates a "tragedy of the commons" , where everybody is uncooperative in the end. Yet, when people interact 
with friends or local neighbors, as most people do, moralists can escape the direct competition with non- 
punishing cooperators by separating from them. Moreover, in the competition with free-riders, moralists 
can defend their interests better than non-punishing cooperators. Therefore, while seriously depleted in the 
beginning, moralists can finally spread all over the world ("who laughs last laughs best effect"). Strikingly, the 
presence of a few non-cooperative individuals ("deviant behavior") can accelerate the victory of moralists. In 
order to spread, moralists may also form an "unholy cooperation" with people having double moral standards, 
i.e. free-riders who punish non-cooperative behavior, while being uncooperative themselves. 
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Introduction 

Public goods such as environmental resources or social benefits are particularly prone to exploitation by non- 
cooperative individuals ("defectors"), who try to increase their benefit at the expense of fair contributors or 
users, the "cooperators" . This implies a tragedy of commons [1J. It was proposed that costly punishment 
of non-cooperative individuals can establish cooperation in public goods dilemmas [2-8], and it is effective 
indeed [91-111]. Nonetheless, why would cooperators choose to punish defectors at a personal cost [12H14] ? One 
would expect that evolutionary pressure should eventually eliminate such "moralists" due to their extra costs 
compared to "second-order free-riders" (i.e. cooperators, who do not punish). These, however should finally 
be defeated by "free-riders" (defectors). To overcome this problem [151116) , it was proposed that cooperators 
who punish defectors (called "moralists" by us) would survive through indirect reciprocity [17], reputation 
effects [18] or the possibility to abstain from the joint enterprize [19-21] by "volunteering" [22,23]. Without 
such mechanisms, cooperators who punish will usually vanish. Surprisingly, however, the second-order free- 
rider problem is naturally resolved, without assuming additional mechanisms, if spatial or network interactions 
are considered. This will be shown in the following. 

In order to study the conditions for the disappearance of non-punishing cooperators and defectors, we 
simulate the public goods game with costly punishment, considering two cooperative strategies (C, M) and 
two defective ones (D, I). For illustration, one may imagine that cooperators (C) correspond to countries 
trying to meet the CO2 emission standards of the Kyoto protocol [24], and "moralists" (M) to cooperative 
countries that additionally enforce the standards by international pressure (e.g. embargoes). Defectors (D) 
would correspond to those countries ignoring the Kyoto protocol, and immoralists (I) to countries failing to 
meet the Kyoto standards, but nevertheless imposing pressure on other countries to fulfil them. According 
to the classical game-theoretical prediction, all countries would finally fail to meet the emission standards, 
but we will show that, in a spatial setting, interactions between the four strategies C, D, M, and I can 
promote the spreading of moralists. Other well-known public goods problems are over-fishing, the pollution 
of our environment, the creation of social benefit systems, or the establishment and maintenance of cultural 
institutions (such as a shared language, norms, values, etc.). 

Our simplified game-theoretical description of such problems assumes that cooperators (C) and moralists 
(M) make a contribution of 1 to the respective public good under consideration, while nothing is contributed 
by defectors (D) and "immoralists" (I), i.e. defectors who punish other defectors. The sum of all contributions 
is multiplied by a factor r reflecting synergy effects of cooperation, and the resulting amount is equally shared 
among the k + 1 interacting individuals. Moreover, moralists and immoralists impose a fine f3/k on each 
defecting individual (playing D or I), which produces an additional cost j/k per punished defector to them 
(see Methods for details). The division by k scales for the group size, but for simplicity, the parameter /3 is 
called the punishment fine and 7 the punishment cost. 

Given the same interaction partners, an immoralist never gets a higher payoff than a defector, but does 
equally well in a cooperative environment. Moreover, a cooperator tends to outperform a moralist, given the 
interaction partners are the same. However, a cooperator can do better than a defector when the punishment 
fine /3 is large enough. 

It is known that punishment in the public goods game and similar games can promote cooperation above a 
certain critical threshold of the synergy factor r [11II25| . Besides cooperators who punish defectors, Heckathorn 
considered "full cooperators" (moralists) and "hypocritical cooperators" (immoralists) [26] ■ For well-mixed 
interactions (where individuals interact with a representative rather than local strategy distribution), Eldakar 
and Wilson find that altruistic punishment (moralists) can spread, if second-order free-riders (non-punishing al- 
truists) are excluded, and that selfish punishers (immoralists) can survive together with altruistic non-punishers 
(cooperators), provided that selfish nonpunishers (defectors) are sufficiently scarce [27j . 

Besides well-mixed interactions, some researchers have also investigated the effect of spatial interactions 
[5l fTT1[28l[29] , since it is known that they can support the survival or spreading of cooperators [30] (but this is 
not always the case [311132] ). In this way, Brandt et al. discovered a coexistence of cooperators and defectors 
for certain parameter combinations [IT]. Compared to these studies, our model assumes somewhat different 
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replication and strategy updating rules. The main point, however, is that we have chosen long simulation 
times and scanned the parameter space more extensively, which revealed several new insights, for example, 
the possible coexistence of immoralists and moralists, even when a substantial number of defectors is present 
initially. When interpreting our results within the context of moral dynamics [33], our main discoveries for a 
society facing public goods games may be summarized as follows: 

1. Victory over second-order free-riders: Over a long enough time period, moralists fully eliminate coopera- 
tors, thereby solving the "second-order free-rider problem" . This becomes possible by spatial segregation 
of the two cooperative strategies C and M, where the presence of defectors puts moralists in a advan- 
tageous position, which eventually allows moralists to get rid of non-punishing cooperators. 

2. "Who laughs last laughs best effect" : Moralists defeat cooperators even when the defective strategies I 
and D are eventually eliminated, but this process is very slow. That is, the system behavior changes its 
character significantly even after very long times. This is the essence of the "who laughs last laughs best 
effect". The finally winning strategy can be in a miserable situation in the beginning, and its victory 
may take very long. 

3. "Lucifer's positive side effect" : By permanently generating a number of defectors, small mutation rates 
can considerably accelerate the spreading of moralists. 

4. "Unholy collaboration" of moralists with immoralists: Under certain conditions, moralists can survive by 
profiting from immoralists. This actually provides the first explanation for the existence of defectors, who 
hypocritically punish other defectors, although they defect themselves. The occurrence of this strange 
behavior is well-known in reality and even experimentally confirmed [341135] , 

These discoveries required a combination of theoretical considerations and extensive computer simulations on 
multiple processors over long time horizons. 

Results 

For well-mixed interactions, defectors are the winners of the evolutionary competition among the four behavioral 
strategies C, D, M, and I [36J, which implies a tragedy of the commons despite punishment efforts. The reason 
is that cooperators (second-order free-riders) spread at the cost of moralists, while requiring them for their 
own survival. 

Conclusions from computer simulations are strikingly different, if the assumption of well-mixed interactions 
is replaced by the more realistic assumption of spatial interactions. When cooperators and defectors interact in 
space [5i mi[3"7M44] . it is known that some cooperators can survive through spatial clustering [45|. However, it 
is not clear how the spatiotemporal dynamics and the frequency of cooperation would change in the presence 
of moralists and immoralists. Would spatial interactions be able to promote the spreading of punishment and 
thereby eliminate second-order free-riders? 

In order to explore this, we have scanned a large parameter space. Figure 1 shows the resulting state of the 
system as a function of the punishment cost 7 and punishment fine j3 after a sufficiently long transient time. 
If the fine-to-cost ratio f3/-f and the synergy factor r are low, defectors eliminate all other strategies. However, 
for large enough fines /3, cooperators and defectors are always eliminated, and moralists prevail (Fig. 1). 

At larger r values, when the punishment costs are moderate, we find a coexistence of moralists with de- 
fectors without any cooperators. To understand why moralists can outperform cooperators despite additional 
punishment costs, it is important to analyze the dynamics of spatial interactions. Starting with a homogeneous 
strategy distribution (Fig. 2a), the imitation of better- performing neighbors generates small clusters of indi- 
viduals with identical strategies (Fig. 2b). "Immoralists" die out quickly, while cooperators and moralists form 
separate clusters in a sea of defectors (Fig. 2c). The further development is determined by the interactions 
at the interfaces between clusters of different strategies (Figs. 2d-f). In the presence of defectors, the fate of 
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moralists is not decided by a direct competition with cooperators, but rather by the success of both cooperative 
strategies against invasion attempts by defectors. If the /3/7-ratio is appropriate, moralists respond better to 
defectors than cooperators. Indeed, moralists can spread so successfully in the presence of defectors that areas 
lost by cooperators are quickly occupied by moralists (supplementary Video SI). This indirect territorial battle 
ultimately leads to the extinction of cooperators (Fig. 2f), thus resolving the second-order free-rider problem. 

In conclusion, the presence of some conventional free-riders (defectors) supports the elimination of second- 
order free-riders. However, if the fine-to-cost ratio is high, defectors are eliminated after some time. Then, the 
final struggle between moralists and cooperators takes such a long time that cooperators and moralists seem 
to coexist in a stable way. Nevertheless, a very slow coarsening of clusters is revealed, when simulating over 
extremely many iterations. This process is finally won by moralists, as they are in the majority by the time the 
defectors disappear, while they happen to be in the minority during the first stage of the simulation (see Fig. 
2). We call this the "who laughs last laughs best effect". Since the payoffs of cooperators and moralists are 
identical in the absence of other strategies, the underlying coarsening dynamics is expected to agree with the 
voter model [46] , 

Note that there is always a punishment fine /3, for which moralists can outcompete all other strategies. 
The higher the synergy factor r, the lower the /3/7-ratio required to reach the prevalence of moralists. Yet, 
for larger values of r, the system behavior also becomes richer, and there are areas for small fines or high 
punishment costs, where clusters with different strategies can coexist (see Figs. lb-d). For example, we 
observe the coexistence of clusters of moralists and defectors (see Fig. 2 and supplementary Video SI) or of 
cooperators and defectors (see supplementary Video S2). 

Finally, for low punishment costs 7 but moderate punishment fines and synergy factors r (see Fig. Id), the 
survival of moralists may require the coexistence with "immoralists" (see Fig. 3 and supplementary Video S3). 
Such immoralists are often called "sanctimonious" or blamed for "double moral standards", as they defect 
themselves, while enforcing the cooperation of others (for the purpose of exploitation). This is actually the 
main obstacle for the spreading of immoralists, as they have to pay punishment costs, while suffering from 
punishment fines as well. Therefore, immoralists need small punishment costs 7 to survive. As cooperators 
die out quickly for moderate values of r, the survival of immoralists depends on the existence of moralists 
they can exploit, otherwise they cannot outperform defectors. Conversely, moralists benefit from immoralists 
by supporting the punishment of defectors. Note, however, that this mutually profitable interaction between 
moralists and immoralists, which appears like an "unholy collaboration" , is fragile: If j3 is increased, immoralists 
suffer from fines, and if 7 is increased, punishing becomes too costly. In both cases, immoralists die out, and 
the coexistence of moralists and immoralists breaks down. Despite this fragility, "hypocritical" defectors, who 
punish other defectors, are known to occur in reality. Their existence has even been found in experiments 
(34l[35]. Here, we have revealed conditions for their occurrence. 

Discussion 

In summary, the second-order free-rider problem finds a natural and simple explanation, without requiring 
additional assumptions, if the local nature of most social interactions is taken into account and punishment 
efforts are large enough. In fact, the presence of spatial interactions can change the system behavior so dra- 
matically that we do not find the dominance of free-riders (defectors) as in the case of well-mixed interactions, 
but a prevalence of moralists via a "who laughs last laughs best" effect (Fig. 2). Moralists can escape dis- 
advantageous kinds of competition with cooperators by spatial segregation. However, their triumph over all 
the other strategies requires the temporary presence of defectors, who diminish the cooperators (second-order 
free-riders). Finally, moralists can take over, as they have reached a superiority over cooperators (which is 
further growing) and as they can outcompete defectors (conventional free-riders). 

Our findings stress how crucial spatial or network interactions in social systems are. Their consideration 
gives rise to a rich variety of possible dynamics and a number of continuous or discontinuous transitions between 
qualitatively different system behaviors. Spatial interactions can even invert the finally expected system 
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behavior and, thereby, explain a number of challenging puzzles of social, economic, and biological systems. This 
includes the higher-than-expected level of cooperation in social dilemma situations, the elimination of second- 
order free-riders, and the formation of what looks like a collaboration between otherwise inferior strategies. 

By carefully scanning the parameter space, we found several possible kinds of coexistence between two 
strategies each: 

• Moralists (M) and defectors (D) can coexist, when the disadvantage of cooperative behavior is not 
too large (i.e. the synergy factor is high enough), and if the punishment fine is sufficiently large that 
moralists can survive among defectors, but not large enough to get rid of them. 

• Instead of M and D, moralists (M) and immoralists (I) coexist, when the punishment cost is small 
enough. The small punishment cost is needed to ensure that the disadvantage of punishing defectors (I) 
compared to non-punishing defectors (D) is small enough that it can be compensated by the additional 
punishment efforts contributed by moralists. 

• To explain the well-known coexistence of D and C it is useful to remember that defectors can be 
crowded out by cooperators, when the synergy factor exceeds a critical value (even when punishment 
is not considered). Slightly below this threshold, neither cooperators nor defectors have a sufficient 
advantage to get rid of the other strategy, which results in a coexistence of both strategies. 

Generally, a coexistence of strategies occurs, when the payoffs at the interface between clusters of different 
strategies are balanced. In order to understand why the coexistence is possible in a certain parameter area 
rather than just for an infinitely small parameter set, it is important to consider that typical cluster sizes 
vary with the parameter values. This also changes the typical radius of the interface between the coexisting 
strategies and, thereby, the typical number of neighbors applying the same strategy or a different one. In other 
words, a change in the shape of a cluster can partly counter-balance payoff differences between two strategies 
by varying the number of "friends" and "enemies" involved in the battle at the interface between spatial areas 
with different strategies (see Fig. |4}. 

Finally, we would like to discuss the robustness of our observations. It is well-known that the level of 
cooperation in the public goods game is highest in small groups [10| . However, we have found that moralists 
can crowd out non-punishing cooperators also for group sizes of fc + 1 = 9, 13, 21, or 25 interacting individuals, 
for example. In the limiting case of large groups, where everybody interacts with everybody else, we expect 
the outcome of the well-mixed case, which corresponds to defection by everybody (if other mechanisms like 
reputation effects [IT] or abstaining are not considered [20]). That is, the same mechanisms that can create 
cooperation among friends may fa/7 to establish shared moral standards, when spatial interactions are negligible. 
It would therefore be interesting to study, whether the fact that interactions in the financial system are global, 
has contributed to the financial crisis. Typically, when social communities exceed a certain size, they need 
sanctioning institutions to stabilize cooperation (such as laws, an executive system, and police). 

Note that our principal discoveries are not expected to change substantially for spatial interactions within 
irregular grids (i.e. neighborhoods different from Moore neighborhoods) |47j . In case of network interactions, 
we have checked that small-world or random networks lead to similar results, when the degree distribution is 
the same (see Fig. [5]). A heterogeneous degree distribution is even expected to reduce free-riding [37j (given 
the average degree is the same). Finally, adding other cooperation-promoting mechanisms to our model such 
as direct reciprocity (a shadow of the future through repeated interactions [49]), indirect reciprocity [17] (trust 
and reputation effects [TI][T8]), abstaining from a joint enterprize [I9l423| , or success-driven migration |50] , 
will strengthen the victory of moralists over conventional and second-order free-riders. 

In order to test the robustness of our observations, we have also checked the effect of randomness ( "noise" ) 
originating from the possibility of strategy mutations. It is known that mutations may promote cooperation 
[51] , According to the numerical analysis of the spatial public goods game with punishment, the introduction 
of rare mutations does not significantly change the final outcome of the competition between moralists and 
non-punishing cooperators. Second-order free-riders will always be a negligible minority in the end, if the 
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fine-to-cost ratio and mutation rate allows moralists to spread. While a large mutation rate naturally causes 
a uniform distribution of strategies, a low level of strategy mutations can be even beneficial for moralists. 
Namely, by permanently generating a number of defectors, small mutation rates can considerably accelerate 
the spreading of moralists, i.e. the slow logarithmic coarsening is replaced by another kind of dynamics [52] , 
Defectors created by mutations play the same role as in the D + M phase (see Figs. 1+2). They put moralists 
into an advantage over non-punishing cooperators, resulting in a faster spreading of the moralists (which 
facilitates the elimination of second-order free-riders over realistic time periods). In this way, the presence of 
a few "bad guys" (defectors) can accelerate the spreading of moral standards. Metaphorically speaking, we 
call this "lucifer's positive side effect". 

The current study paves the road for several interesting extensions. It is possible, for example, to study 
antisocial punishment [53], considering also strategies which punish cooperators [54]. The conditions for the 
survival or spreading of antisocial punishers can be identified by the same methodology, but the larger number 
of strategies creates new phases in the parameter space. While the added complexity transcends what can 
be discussed here, the current study demonstrates clearly how differentiated the moral dynamics in a society 
facing public goods problems can be and how it depends on a variety of factors (such as the punishment cost, 
punishment fine, and synergy factor). Going one step further, evolutionary game theory may even prove useful 
to understand how moral feelings have evolved. 

Furthermore, it would be interesting to investigate the emergence of punishment within the framework of 
a coevolutionary model [55-57J, where both, individual strategies and punishment levels are simultaneously 
spread. Such a model could, for example, assume that individuals show some exploration behavior [51J and stick 
to successful punishment levels for a long time, while they quickly abandon unsuccessful ones. In the beginning 
of this coevolutionary process, costly punishment would not pay off. However, after a sufficiently long time, 
mutually fitting punishment strategies are expected to appear in the same neighborhood by coincidence [50J. 
Once an over-critical number of successful punishment strategies have appeared in some area of the simulated 
space, they are eventually expected to spread. The consideration of success-driven migration should strongly 
support this process [50]. Over many generations, genetic-cultural coevolution could finally inherit costly 
punishment as a behavioral trait, as is suggested by the mechanisms of strong reciprocity [58] . 

Methods 

We study the public goods game with punishment. Cooperative individuals (C and M) make a contribution of 
1 to the public good, while defecting individuals (D and I) contribute nothing. The sum of all contributions 
is multiplied by r and the resulting amount equally split among the k + 1 interacting individuals. A defecting 
individual (D or I) suffers a fine j3/k by each punisher among the interaction partners, and each punishment 
requires a punisher (M or I) to spend a cost j/k on each defecting individual among the interaction partners. 
In other words, only defectors and punishing defectors (immoralists) are punished, and the overall punishment 
is proportional to the sum of moralists and immoralists among the k neighbors. The scaling by k serves to 
make our results comparable with models studying different groups sizes. 

Denoting the number of so defined cooperators, defectors, moralists, and immoralists among the k 
interaction partners by Nc, Njj, Nm and Nj, respectively, an individual obtains the following payoff: 
If it is a cooperator, it gets Pq = r(Nc + Nm + l)/(fc + 1) — 1, if a defector, the payoff is P& = 
r(N c + N M )/(k + 1) - (3(N AI + Ni)/k, a moralist receives P M = P c - j(N D + N T )/k, and an immoralist 
obtains Pj = Pq — j(Nd + Nj)/k. Our model of the spatial variant of this game studies interactions in a 
simple social network allowing for clustering. It assumes that individuals are distributed on a square lattice 
with periodic boundary conditions and play a public goods game with k = 4 neighbors. We work with a fully 
occupied lattice of size L x L with L = 200. ..1200 in Fig. 1 and L = 100 in Figs. 2-4 (the lattice size must 
be large enough to avoid an accidental extinction of a strategy). The initial strategies of the L 2 individuals 
are equally and uniformly distributed. Then, we perform a random sequential update. The individual at the 
randomly chosen location x belongs to five groups. (It is the focal individual of a Moore neighborhood and a 
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member of the Moore neighborhoods of four nearest neighbors). It plays the public goods game with the k 
interaction partners of a group g, and obtains a payoff P£ in all 5 groups it belongs to. The overall payoff is 
P x = J2 g Pi- Next, one of the four nearest neighbors is randomly chosen. Its location shall be denoted by y 
and its overall payoff by P y . This neighbor imitates the strategy of the individual at location x with probability 
q = 1/{1 + exp[(P y — P x )/K]} [45]. That is, individuals tend to imitate better performing strategies in their 
neighborhood, but sometimes deviate (due to trial-and-error behavior or mistakes) [31J. Realistic noise levels 
lie between the two extremes K — > (corresponding to unconditional imitation by the neighbor, whenever 
the overall payoff P x is higher than P y ) and K — > oo (where the strategy is copied with probability 1/2, 
independently of the payoffs). For the noise level K = 0.5 chosen in our study, the evolutionary selection 
pressure is high enough to eventually eliminate poorly performing strategies in favor of strategies with a higher 
overall payoff. This implies that the resulting frequency distribution of strategies in a large enough lattice is 
independent of the specific initial condition after a sufficiently long transient time. Close to the separating 
line between M and D+M in Fig. 1, the equilibration may require up to 10 7 iterations (involving L 2 updates 
each). 

Acknowledgments 

We acknowledge partial financial support from the EU Project QLectives and the ETH Competence Center 
"Coping with Crises in Complex Socio-Economic Systems" (CCSS) through ETH Research Grant CH1-01 08-2 
(D.H.), from the Hungarian National Research Fund (grant K-73449 to A.S. and G.S.), the Bolyai Research 
Grant (to A.S.), the Slovenian Research Agency (grant Zl-2032-2547 to M.P.), and the Slovene-Hungarian 
bilateral incentive (grant BI-HU/09-10-001 to A.S., M.P. and G.S.). D.H. would like to thank for useful 
comments by Carlos P. Roca, Moez Draief, Stefano Balietti, Thomas Chadefaux, and Sergi Lozano. 

References 

1. Hardin G (1968) The tragedy of the commons. Science 162: 1243-1248. 

2. Fehr E, Gachter S (2002) Altruistic punishment in humans. Nature 415: 137-140. 

3. Fehr E, Fischbacher U (2003) The nature of human altruism. Nature 425: 785-791. 

4. Hammerstein P, ed (2003) Genetic and Cultural Evolution of Cooperation (MIT Press, Cambridge, 
MA). 

5. Nakamaru M, Iwasa Y (2006) The evolution of altruism and punishment: Role of selfish punisher. J 
Theor Biol 240: 475-488. 

6. Camerer CF, Fehr E (2006) When does "economic man" dominate social behavior. Science 311: 47-52. 

7. Gurerk O, Irlenbusch B, Rockenbach B (2006) The competitive advantage of sanctioning institutions. 
Science 312: 108-111. 

8. Sigmund K, Hauert C, Nowak MA (2001) Reward and punishment. Proc. Natl. Acad. Sci. USA 98:10757- 
10762. 

9. Henrich J, Boyd R (2001) Why people punish defectors. J Theor Biol 208: 79-89 

10. Boyd R, Gintis H, Bowles S, Richerson PJ (2003) The evolution of altruistic punishment. Proc Natl 
Acad Sci USA 100: 3531-3535. 



Evolutionary establishment of moral. 



8 



11. Brandt H, Hauert C, Sigmund K (2003) Punishing and reputation in spatial public goods games. Proc 
R Soc Lond Ser B 270: 1099-1104. 

12. T. Yamagishi (1986) The provision of a sanctioning system as a public good. Journal of Personality and 
Social Psychology 51: 110-116. 

13. Fehr E (2004) Don't lose your reputation. Nature 432: 449-450. 

14. Colman AM (2006) The puzzle of cooperation. Nature 440: 744-745. 

15. Fowler JH (2005) Second-order free-riding problem solved? Nature 437: E8-E8. 

16. Panchanathan K, Boyd R (2005) Reply. Nature 437: E8-E9. 

17. Panchanathan K, Boyd R (2004) Indirect reciprocity can stabilize cooperation without the second-order 
free rider problem. Nature 432: 499-502. 

18. Milinski M, Semmann D, Krambeck H-J (2002) Reputation helps to solve the "tragedy of the commons". 
Nature 415: 424-426. 

19. Fowler JH (2005) Altruistic punishment and the origin of cooperation. Proc Natl Acad Sci USA 102: 
7047-7049. 

20. Brandt H, Hauert C, Sigmund K (2006) Punishing and abstaining for public goods. Proc Natl Acad Sci 
USA 103: 495-497. 

21. Hauert C, Traulsen A, Brandt H, Nowak MA, Sigmund K (2007) Via freedom to coercion: The emer- 
gence of costly punishment. Science 316: 1905-1907. 

22. Hauert C, De Monte S, Hofbauer J, Sigmund K (2002) Volunteering as red queen mechanism for 
cooperation in public goods game. Science 296: 1129-1132. 

23. Semmann D, Krambeck H-J, Milinski M (2003) Volunteering leads to rock-paper-scissors dynamics in 
a public goods game. Nature 425: 390-393. 

24. Milinski M, Sommerfeld RD, Krambeck HJ, Reed FA, Marotzke J (2008) The collective-risk social 
dilemma and the prevention of simulated dangerous climate change. Proc Natl Acad Sci USA 105: 
2291-2294. 

25. Sigmund K (2010) The Calculus of Selfishness (Princeton University Press, Princeton). 

26. Heckathorn DD (1996) The dynamics and dilemmas of collective action. American Sociological Review 
61: 250-277. 

27. Eldakar OT, Wilson DS (2008) Selfishness as second-order altruism. Proc. Natl. Acad. Sci. USA 109: 
6982-6986. 

28. Nakamaru M, Iwasa Y (2005) The evolution of altruism by costly punishment in the lattice structured 
population: Score-dependent viability versus score-dependent fertility. Evolutionary Ecology Research 
7: 853-870. 

29. Sekiguchi T, Nakamaru M (2009) Effect of the presence of empty sites on the evolution of cooperation 
by costly punishment in spatial games. Journal of Theoretical Biology 256(2): 297-304. 

30. Nowak MA (2006) Five rules for the evolution of cooperation. Science 314, 1560-1563. 



Evolutionary establishment of moral. 



9 



31. Traulsen A, et al. (2010) Human strategy updating in evolutionary games. Proc. Natl. Acad. Sci. USA 
107, 2962-2966. 

32. Nowak MA, Tarnita CE, Antal T (2010) Evolutionary dynamics in structured populations. Phil. Trans. 
R. Soc. B 365: 19-30. 

33. Hauser M D (2006) Moral Minds: How Nature Designed Our Universal Sense of Right and Wrong 
(Ecco, New York). 

34. Falk A, Fehr E, Fischbacher U (2005) Driving forces behind informal sanctions. Econometrica 73: 
2017-2030. 

35. Shinada M, Yamagishi T, Ohmura Y (2004) False friends are worse than bitter enemies: "Altruistic" 
punishment of in-group members. Evol. Hum. Behav. 25: 379-393. 

36. Szabo G, Hauert C (2002) Phase transitions and volunteering in spatial public goods games. Phys Rev 
Lett 89: 118101. 

37. Santos FC, Santos MD, Pacheco JM (2008) Social diversity promotes the emergence of cooperation in 
public goods games. Nature 454: 213-216. 

38. Wakano JY, Nowak MA, Hauert C (2009) Spatial dynamics of ecological public goods. Proc Natl Acad 
Sci USA 106: 7910-7914. 

39. Nowak MA, May RM (1992) Evolutionary games and spatial chaos. Nature 359: 826-829. 

40. Nowak MA, Bonhoeffer S, May, RM (1994) More spatial games. Int J Bifurcat Chaos 4: 33-56. 

41. Nowak MA, Bonhoeffer S, May RM (1994) Spatial games and the maintenance of cooperation. Proc 
Natl Acad Sci USA 91: 4877-4881. 

42. Nowak MA, Bonhoeffer S, May RM (1996) Robustness of cooperation. Nature 379, 125-126. 

43. Nathanson CG, Tarnita CE, Nowak MA (2009) Calculating evolutionary dynamics in structured popu- 
lations. PLoS Comput Biol 5: el000615. 

44. Pacheco JM, Pinheiro FL, Santos FC (2009) Population structure induces a symmetry breaking favoring 
the emergence of cooperation. PLoS Comput Biol 5: el000596. 

45. Szabo G, Toke C (1998) Evolutionary prisoner's dilemma game on a square lattice. Phys Rev E 58: 
69-73 

46. Dornic I, Chate H, Chave J, Hinrichsen H (2001) Critical coarsening without surface tension: The 
universality class of the voter model. Phys Rev Lett 87, 045701. 

47. Flache A, Hegselmann R (2001) Do irregular grids make a difference? Relaxing the spatial regularity 
assumption in cellular models of social dynamics. Journal of Artificial Societies and Social Simulation 



4: 4, see http://www.soc.surrey.ac.Uk/JASSS/4/4/6.html 



48. Szabo G, Szolnoki A, Izsak R (2004) Rock-scissors-paper game on regular small-world networks. J. 
Phys. A: Math. Gen. 37: 2599-2609. 

49. Axelrod R (1984) The Evolution of Cooperation (Basic Books, New York). 

50. Helbing D, Yu W (2009) The outbreak of cooperation among success-driven individuals under noisy 
conditions. Proc Natl Acad Sci USA 106: 3680-3685. 



Evolutionary establishment of moral. 



10 



51. Traulsen A, Hauert C, Silva HD, Nowak MA, Sigmund K (2009) Exploration dynamics in evolutionary 
games. Proc Natl Acad Sci USA 106: 709-712. 

52. Helbing D, Szolnoki A, Perc M, Szabo G (2010) Defector-accelerated cooperativeness in public goods 
games with punishment and mutations, in preparation. 

53. Herrmann B, Thoni C, Gachter S (2008) Antisocial punishment across societies. Science 319: 1362- 
1367. 

54. Rand DG, Ohtsuki H, Nowak MA (2009) Direct reciprocity with costly punishment: Generous tit-for-tag 
prevails. J Theor Biol 256: 45-57. 

55. Szabo G, Szolnoki A, Jeromos V (2009) Selection of dynamical rules in spatial Prisoner's Dilemma 
games. EPL 87: 18007. 

56. Santos FC, Pacheco JM, Lenaerts T (2006) Cooperation prevails when individuals adjust their social 
ties. PLoS Comput Biol 2: 1284-1291. 

57. Perc M, Szolnoki A (2010) Coevolutionary games - A mini review. BioSystems 99: 109-125. 

58. Bowles S, Gintis H (2004) The evolution of strong reciprocity: Cooperation in heterogeneous popula- 
tions. Theor Popul Biol 65: 17-28. 

59. Helbing D, Szolnoki A, Perc M, Szabo G (2010) Spreading of costly punishment and evolutionary 
resonance in the spatial public goods game, in preparation. 



Evolutionary establishment of moral. 



11 




Figure 1. Phase diagrams showing the remaining strategies in the spatial public goods 
game with cooperators (C), defectors (D), moralists (M) and immoralists (I), after a 
sufficiently long transient time. Initially, each of the four strategies occupies 25% of the sites of the 
square lattice, and their distribution is uniform in space. However, due to their evolutionary 
competition, two or three strategies die out after some time. The finally resulting state depends on the 
synergy r of cooperation, the punishment cost 7, and the punishment fine (3. The displayed phase 
diagrams are for (a) r = 2.0, (b) r = 3.5, and (d) r = 4.4. (d) Enlargement of the small-cost area for 
r = 3.5. Solid separating lines indicate that the resulting fractions of all strategies change continuously 
with a modification of the model parameters /? and 7, while broken lines correspond to discontinuous 
changes. All diagrams show that cooperators cannot stop the spreading of moralists, if only the 
finc-to-cost ratio is large enough. Furthermore, there are parameter regions where moralist can crowd 
out cooperators in the presence of defectors. Note that the spreading of moralists is extremely slow and 
follows a voter model kind of dynamics [46j . if their competition with cooperators occurs in the absence 
of defectors. Therefore, computer simulations had to be run over extremely long times (up to 10 7 
iterations for a systems size of 400 x 400). For similar reasons, a small level of strategy mutations 
(which permanently creates a small number of strategies of all kinds, in particular defectors) can largely 
accelerate the spreading of moralists in the M phase, while it does not significantly change the resulting 
fractions of the four strategies [52] . The existence of immoralists is usually not relevant for the outcome 
of the evolutionary dynamics. Apart from a very small parameter area, where immoralists and 
moralists coexist, immoralists arc quickly extinct. Therefore, the 4-stratcgy model usually behaves like 
a model with the three strategies C, D, and M only. As a consequence, the phase diagrams for the 
latter look almost the same like the ones presented here [59] ■ 
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Figure 2. Elimination of second-order free-riders (non-punishing cooperators) in the 
spatial public goods game with costly punishment for r = 4.4, (5 = 0.1, and 7 = 0.1. (a) 

Initially, at time t = 0, cooperators (blue), defectors (red), moralists (green) and immoralists (yellow) 
are uniformly distributed over the spatial lattice, (b) After a short time period (here, at t = 10), 
defectors prevail, (c) After 100 iterations, immoralists have almost disappeared, and cooperators 
prevail, since cooperators earn high payoffs when organized in clusters, (d) At t = 500, there is a 
segregation of moralists and cooperators, with defectors in between, (e) The evolutionary battle 
continues between cooperators and defectors on the one hand, and defectors and moralists on the other 
hand (here at t = 1000). (f) At t ~ 2000, cooperators have been eliminated by defectors, and a small 
fraction of defectors survives among a large majority of moralists. Interestingly, each strategy (apart 
from I) has a time period during which it prevails, but only moralists can maintain their majority. 
While moralists perform poorly in the beginning, they are doing well in the end. We refer to this as the 
"who laughs last laughs best" effect. 
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Figure 3. Coexistence of moralists and immoralists for r = 3.5, (3 = 0.12, and 7 = 0.005, 
supporting the occurrence of individuals with 'double moral standards' (who punish 
defectors, while defecting themselves), (a) Initially, at time t = 0, cooperators (blue), defectors 
(red), moralists (green) and immoralists (yellow) are uniformly distributed over the spatial lattice, (b) 
After 250 iterations, cooperators have been eliminated in the competition with defectors (as the synergy 
effect r of cooperation is not large enough), and defectors are prevailing, (c-e) The snapshots at 
t = 760, t = 2250, and t = 6000 show the interdependence of moralists and immoralists, which appears 
like a tacit collaboration. It is visible that the two punishing strategies win the struggle with defectors 
by staying together. On the one hand, due to the additional punishment cost, immoralists can survive 
the competition with defectors only by exploiting moralists. On the other hand, immoralists support 
moralists in fighting defectors, (f) After 12000 iterations, defectors have disappeared completely, 
leading to a coexistence of clusters of moralists with immoralists. 
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Figure 4. Dependence of cluster shapes on the punishment fine ft in the stationary state, 
supporting an adaptive balance between the payoffs of two different strategies at the 
interface between competing clusters. Snapshots in the top row were obtained for low punishment 
fines, while the bottom row depicts results obtained for higher values of f3. (a) Coexistence of moralists 
and defectors for a synergy factor r — 3.5, punishment cost 7 = 0.20, and punishment fine /3 = 0.25. (b) 
Same parameters, apart from j3 = 0.4. (c) Coexistence of moralists and immoralists for r = 3.5, 
7 = 0.05, and /3 = 0.12. (d) Same parameters, apart from /3 = 0.25. A similar change in the cluster 
shapes is found for the coexistence of cooperators and defectors, if the synergy factor r is varied. 
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Figure 5. Resulting fractions of the four strategies C, D, I, and M, for random regular 
graphs as a function of the punishment fine /?. The graphs were constructed by rewiring links of 
a square lattice of size 400 x 400 with probability Q, thereby preserving the degree distribution (i.e. 
every player has 4 nearest neighbors) [48]. For small values of Q, small- world properties result, while for 
Q — > 1, we have a random regular graph. By keeping the degree distribution fixed, we can study the 
impact of randomness in the network structure independently of other effects. An inhomogeneous 
degree distribution can further promote cooperation [37]. The results displayed here are averages over 
10 simulation runs for the model parameters r = 3.5, 7 = 0.05, and Q = 0.99. Similar results can be 
obtained also for other parameter combinations. 



